Submit two files that are labeled: YourLastName_Assignment_3 that have the following formats:
When programming in Python, refer to Kenneth Reitz’ PEP 8: The Style Guide for Python Code: http://pep8.org/ (Links to an external site.)Links to an external site. There is the Google style guide for Python at https://google.github.io/styleguide/pyguide.html (Links to an external site.)Links to an external site. Comment often and in detail.
In his first state of the uniion address , president Trump mentioned Chicago violance 10 times Trump's State of the Union Address
Columnist Clarence Page wrote an article , published by the Chicago Tribune stated that the city of Chicago had more homicides in the past two years than New York and Los Angeles combined
The CSV file for crimes dataset for the city of Chicago is obtained from the data portal for the city of Chicago. Here is the link for the city of Chicago data portal City of Chicago Data Portal
Three set of data are need for this assignment:
Complete description of the dataset can be found on Chicago city data portal.
Based on Trumps State of the Uniion Address and the article written by columnist Clarence Page and published by the Chicago Tribune, we are interested to retrieve the data for the past two years and perform different types of spatial queries.
There are few of these queries that we are interested in to help CPD and city of Chicago to plot on a Choroplteh map those districts that have highest gun crimes.
Here are examples of those types of queries:
Packages you need to Connect PostgreSQL server to load and retrieve Crhicago Crime dataset from the database:
Since we are using PostGIS in our work, please read and bookmark Chapter 4. Using PostGIS: Data Management and Queries
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install psycopg2-binary
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install area
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install folium
import folium
from folium import plugins
from folium.plugins import MarkerCluster
import psycopg2
import csv
import pandas as pd
import json
from area import area
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
from IPython.display import display
folium.__version__
psycopg2.__version__
# Use this data set for your final submission of your Assignment 3
# Uncomment the following line after you unit test your code and ready to run and submit
#your assignment on this dataset
# db_connection = psycopg2.connect(host='129.105.248.26', dbname="chicago_crimes", user="YourNetID")
# Use the following dataset for unit testing purposes only.
# Comment the following line when you are done with your unit testing and ready to run your assignment
#on the complete dataset and submit your Assignment
db_connection = psycopg2.connect(host='129.105.248.26', dbname="chicago_crimes_ut", user="kjq978")
cursor = db_connection.cursor()
# Uncomment and run this code when your transactions get "stuck"
#db_connection.rollback()
The Crimes_2001_to_present.csv is downloaded from Chicago data portal and it has roughly 6.5 million records.
While working on this dataset, It is prudent to make a note of the following:
Sort algorithms used by the database engines vary in performance between O($N log N$) and O($ N^{2} $) where $N$ is the size of the number
Search algorithms used by the database engines vary in performance between O($log N$) and O($ N $) where $N$ is the size of the number
# Get the column names for table crimes
cursor.execute("SELECT column_name \
FROM INFORMATION_SCHEMA.COLUMNS \
WHERE table_name = 'crimes';")
rows=cursor.fetchall()
rows
# Get the column names for table police_stations
cursor.execute("SELECT column_name \
FROM INFORMATION_SCHEMA.COLUMNS \
WHERE table_name = 'police_stations';")
rows=cursor.fetchall()
rows
# Get the first five rows of the police station table
query = "SELECT * \
FROM police_stations"
stations = pd.read_sql_query(query, db_connection)
stations.head()
# Another way to get the police station table
query = "SELECT * \
FROM police_stations"
cursor.execute(query)
data = cursor.fetchall()
colnames = [desc[0] for desc in cursor.description]
# create a data frame with column headers
stations = pd.DataFrame(data,columns = colnames).drop_duplicates()
stations
# store districts (excluding Headquarters) for later use
districts = stations.query("district != 'Headquarters'").district
districts
# Display the first 5 rows in the crimes table
query = "SELECT * \
FROM crimes"
chicago = pd.read_sql_query(query, db_connection)
chicago.head()
# Another way to get the crimes table
query = "SELECT * \
FROM crimes"
cursor.execute(query)
data = cursor.fetchall()
colnames = [desc[0] for desc in cursor.description]
# create a data frame with column headers
chicago = pd.DataFrame(data,columns = colnames)
chicago.columns
DATA DICTIONARY (CRIMES) (See https://www.kaggle.com/currie32/crimes-in-chicago)
id - Unique identifier for the record.
caseno - The Chicago Police Department RD Number (Records Division Number), which is unique to the incident.
date_of_occurence - Date when the incident occurred. this is sometimes a best estimate.
**block - The partially redacted address where the incident occurred, placing it on the same block as the actual address.
iucr - The Illinois Unifrom Crime Reporting code. This is directly linked to the Primary Type and Description. See the list of IUCR codes at https://data.cityofchicago.org/d/c7ck-438e.
primary_type - The primary description of the IUCR code.
description - The secondary description of the IUCR code, a subcategory of the primary description.
location-description - Description of the location where the incident occurred.
arrest - Indicates whether an arrest was made.
domestic - Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act.
beat - Indicates the beat where the incident occurred. A beat is the smallest police geographic area – each beat has a dedicated police beat car. Three to five beats make up a police sector, and three sectors make up a police district. The Chicago Police Department has 22 police districts. See the beats at https://data.cityofchicago.org/d/aerh-rz74.
district - Indicates the police district where the incident occurred. See the districts at https://data.cityofchicago.org/d/fthy-xz3r.
ward - The ward (City Council district) where the incident occurred. See the wards at https://data.cityofchicago.org/d/sp34-6z76.
community_area - Indicates the community area where the incident occurred. Chicago has 77 community areas. See the community areas at https://data.cityofchicago.org/d/cauq-8yn6.
fbi_cd - Indicates the crime classification as outlined in the FBI's National Incident-Based Reporting System (NIBRS). See the Chicago Police Department listing of these classifications at http://gis.chicagopolice.org/clearmap_crime_sums/crime_types.html.
xcoordinate - The x coordinate of the location where the incident occurred in State Plane Illinois East NAD 1983 projection. This location is shifted from the actual location for partial redaction but falls on the same block.
ycoordinate - The y coordinate of the location where the incident occurred in State Plane Illinois East NAD 1983 projection. This location is shifted from the actual location for partial redaction but falls on the same block.
year - Year the incident occurred.
updated_on - Date and time the record was last updated.
latitude - The latitude of the location where the incident occurred. This location is shifted from the actual location for partial redaction but falls on the same block.
longitude - The longitude of the location where the incident occurred. This location is shifted from the actual location for partial redaction but falls on the same block.
location - The location where the incident occurred in a format that allows for creation of maps and other geographic operations on this data portal. This location is shifted from the actual location for partial redaction but falls on the same block.
chicago.info()
chicago.head()
# use value_counts to see the values of the field
chicago['year'].value_counts()
# what is in the field domestic
chicago['domestic'].value_counts()
# let's look at how many crimes are domestic by primary type
pd.crosstab(chicago['primary_type'], chicago['domestic'], margins=True)
# percentages would be more informative
# ------ lambda expressions section 4.7.6 ('syntactic sugar for a normal function definition')
# https://docs.python.org/3/tutorial/controlflow.html
pd.crosstab(chicago['primary_type'], chicago['domestic']).apply(lambda x: x/x.sum(), axis = 1)
# Get the total number of crimes per district.
# Since each row corresponds to a crime, we just need to count the number of rows per district.
cursor.execute("SELECT district, count(district) \
FROM crimes \
GROUP BY district")
rows = cursor.fetchall()
# Create a dataframe from the results of the query.
crimes_per_district = pd.DataFrame(rows, columns=['dist_num','number_of_crimes'])
crimes_per_district['dist_num'] = crimes_per_district['dist_num'].astype(str)
crimes_per_district
# For each district, excluding HQ, we get the latitude/longitude of the police station at the district.
# With a few exceptions, will be inserting markers on the map at these police station locations
cursor.execute("""SELECT ST_X(ST_AsText(Where_IS)), ST_Y(ST_AsText(Where_IS)), district \
FROM police_stations \
WHERE district!='Headquarters'""")
police_stations = cursor.fetchall()
# Create a dataframe with this information.
station_locations = pd.DataFrame(police_stations,columns=['st_latitude','st_longitude','dist_num']).drop_duplicates()
station_locations
# Add columns for the coordinates of each distract. Again, this is going to be the locations of our markers.
query1_df = pd.merge(crimes_per_district, station_locations,on='dist_num')
query1_df
# Create a folium map centered on "downtown Chicago"
total_number_of_crimes_per_district_map = folium.Map(location = (41.8781, -87.6298),zoom_start = 11)
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
folium.Choropleth(geo_data = "Boundaries.geojson",
fill_color = 'OrRd',
fill_opacity =0.5,
line_opacity= 1,
data = crimes_per_district,
key_on = 'feature.properties.dist_num',
columns = ['dist_num', 'number_of_crimes'],
legend_name = "CRIMES PER DISTRICT"
).add_to(total_number_of_crimes_per_district_map)
# We iterate over the rows in the dataframe creating one marker per row.
# The ("popup") label for each district marker displays the district number and number of crimes.
# Each distric marker is added to the map at the district station's location.
for index, row in query1_df.iterrows():
# location of police station
police_station_location = (row['st_latitude'],row['st_longitude'])
district = row['dist_num']
num_crimes = row['number_of_crimes']
# Create the marker and add it to the map.
folium.Marker(location = police_station_location,
popup =\
folium.Popup(html=f"District No : {district} has Total Number of Crimes: {num_crimes}",max_width=450)).\
add_to(total_number_of_crimes_per_district_map)
# Display the finished map.
total_number_of_crimes_per_district_map
Note: A crime is considered a violent crime if the PRIMARY_TYPE of the crimes is THEFT ,ASSAULT,ROBBER,KIDNAPPING,CRIM SEXUAL ASSAULT,BATTERY, or MURDER.
violent_crime_categories = 'THEFT','ASSAULT','ROBBERY','KIDNAPPING','CRIM SEXUAL ASSAULT','BATTERY','MURDER'
# Get the total number of violent crimes per district.
cursor.execute("SELECT district, count(district)\
FROM crimes \
WHERE PRIMARY_TYPE in %s \
GROUP BY district",[violent_crime_categories])
rows=cursor.fetchall()
violent_crime_data=pd.DataFrame(rows, columns=['dist_num','number_of_violent_crimes'])
violent_crime_data['dist_num'] = violent_crime_data['dist_num'].astype(str)
violent_crime_data
Now, lets create a dataframe of the different types of violent crimes for every district
cursor.execute(f"SELECT district, PRIMARY_TYPE, count(PRIMARY_TYPE) \
FROM crimes \
WHERE PRIMARY_TYPE in {violent_crime_categories} \
GROUP BY distrIct,PRIMARY_TYPE")
rows=cursor.fetchall()
violent_crime_type = pd.DataFrame(rows, columns=['dist_num','Description','number_of_violent_crimes'])
violent_crime_type['dist_num'] = violent_crime_type['dist_num'].astype(str)
violent_crime_type
# This time our dataframe will contain one row for each district and each type of violent crime.
# For each distrct number and type of violent crime our dataframe will have the total of violent crimes of that type.
# We once again include the stations location since that is where the markers will appear on the map.
query2_df = pd.merge(violent_crime_type, station_locations,on='dist_num')
query2_df
districts
Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson". Use this data to draw the district regions on the folium map. Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes. Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district. We add this "chropleth layer" to our folium map.
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map
violent_crimes_per_district_map= folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = violent_crime_data, #data source changed from first example
key_on='feature.properties.dist_num',
columns = ['dist_num', 'number_of_violent_crimes'],
legend_name = "VIOLENT CRIMES PER DISTRICT"
).add_to(violent_crimes_per_district_map)
In addition, for each district find the block(s) that has the highest number of gun crimes in that district. Note that there might be a tie for the highest number of gun crimes. You need to find all such blocks. Add a popup marker (located at that district's police headquarter) that displays a DataFrame containing all such block along with the number of gun crimes for that block (i.e. the highest number of crimes for a district).
# We iterate over district numbers and get all the rows in query2_df with that district number.
# This new dataframe contains the number of crimes for each of violent crime types in a particular district.
# We convert this dataframe to "html" which will be displayed in marker label (along with the district number)
# Each distric marker is added to the map at the district station's location like before.
#districts = query2_df.dist_num.unique()
for district in districts:
df2d = query2_df[query2_df.dist_num == district] # new dataframe to add to marker.
police_station_location = tuple(df2d[['st_latitude','st_longitude']].values[0])
violent_crimes_per_district_df = df2d[['Description','number_of_violent_crimes']]
# Convert out new data frame to html before inserting it into the label.
header = violent_crimes_per_district_df.to_html(classes=
'table table-striped table-hover table-condensed table-responsive')
# Create the folium marker and add it to the map.
folium.Marker(location=police_station_location, popup=folium.\
Popup(html=f"District Number {district} - Violent Crimes {header}")).\
add_to(violent_crimes_per_district_map)
# Create the marker and add it to the map.
folium.Marker(location = police_station_location,
popup =\
folium.Popup(html=f"District No : {district} - Violent Crimes {header}",max_width=450)).\
add_to(violent_crimes_per_district_map)
# Display the finished map.
violent_crimes_per_district_map
Note: A crime is considered a gun related violent crime if the word "gun" is contained in the DESCRIPTION and the PRIMARY_TYPE of the crimes is THEFT ,ASSAULT,ROBBER,KIDNAPPING,CRIM SEXUAL ASSAULT,BATTERY, or MURDER.
Lets first create a dataframe of gun crimes per district first to get an idea about the number of gun crimes per district
# isolate violent gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
FROM crimes\
WHERE PRIMARY_TYPE in {violent_crime_categories}\
AND DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_violent_gun_crimes = cursor.fetchall()
districts_violent_gun_crimes_df = pd.DataFrame(districts_violent_gun_crimes, columns=['dist_num','violent_gun_crimes'])
districts_violent_gun_crimes_df['dist_num'] = districts_violent_gun_crimes_df['dist_num'].astype(str)
districts_violent_gun_crimes_df.head()
Now, lets create a dataframe of the different types of violent gun crimes for every district
cursor.execute(f"SELECT district, DESCRIPTION, count(DESCRIPTION) \
FROM crimes \
WHERE PRIMARY_TYPE in {violent_crime_categories} AND DESCRIPTION::text LIKE '{gun}' \
GROUP BY district, DESCRIPTION")
rows=cursor.fetchall()
violent_gun_crime_type = pd.DataFrame(rows, columns=['dist_num','Description','number_of_violent_gun_crimes'])
violent_gun_crime_type['dist_num'] = violent_gun_crime_type['dist_num'].astype(str)
violent_gun_crime_type.head()
query3_df = pd.merge(violent_gun_crime_type, station_locations, on = 'dist_num')
query3_df.head()
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.# Create a folium map centered on "downtown Chicago"
districts_violent_gun_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = districts_violent_gun_crimes_df,
key_on='feature.properties.dist_num',
columns = ['dist_num', 'violent_gun_crimes'],
legend_name="VIOLENT GUN CRIME"
).add_to(districts_violent_gun_crimes_map)
# districts = query3_df.dist_num.unique()
for district in districts:
df3d = query3_df[query3_df.dist_num == district]
police_station_location = tuple(df3d[['st_latitude','st_longitude']].values[0])
violent_gun_crimes_per_district_df = df3d[['Description','number_of_violent_gun_crimes']]
header = violent_gun_crimes_per_district_df.to_html(classes=
'table table-striped table-hover table-condensed table-responsive')
folium.Marker(location=police_station_location, popup=folium.\
Popup(html=f"District Number {district} - Violent GUN Crimes {header}")).\
add_to(districts_violent_gun_crimes_map)
folium.Marker(location = police_station_location,
popup =\
folium.Popup(html=f"District No : {district} - Violent Gun Crimes {header}",max_width=450)).\
add_to(districts_violent_gun_crimes_map)
districts_violent_gun_crimes_map
district = []
tarea=[]
with open('Boundaries.geojson') as f:
data = json.load(f)
a = data['features'] # a is a list of district data (dictionaries)
for i in range(len(a)):
obj = a[i]['geometry'] # list of coordinates defining the ith district boundary
n = a[i]['properties'] # dictionary with district number and district label keyes for ith district
district.append(n['dist_num']) # add district number to the district list
tarea.append(area(obj)/10000) # add the the area (in hectares) to area list
af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
final_data= pd.merge(af, crimes_per_district, on='dist_num', how='inner')
final_data['crime_density'] = round(final_data['number_of_crimes']/(final_data['district_area_inHectares']/100))
final_data
# isolate gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
FROM crimes\
WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
districts_gun_crimes_df.head()
# This time our dataframe will contain one row for each district and each block in the district.
# Each row contains information about a particular gun crime including the block in which the crime occured.
# We also save the location of the crime incidendent and whether there was an arrest.
# We will use this information when creating the markers.
# In particular, markers will be positioned at the crime location instead of police station location.
data= []
for district in districts:
cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block, DESCRIPTION, count(arrest), arrest,latitude,longitude\
FROM crimes\
WHERE district = %s and DESCRIPTION::text LIKE %s \
GROUP BY caseno, block, DESCRIPTION,arrest, latitude, longitude""",[district,gun])
results = cursor.fetchall()
# insert the district number as the first element of each result list
gun_crimes_per_district = [[district]+list(result) for result in results]
data += (gun_crimes_per_district)
gun_crimes_per_district_df = pd.DataFrame(data, columns=['dist_num','caseno','block',\
'Description','arrest_count', 'arrest', 'latitude', 'longitude'])
gun_crimes_per_district_df['dist_num'] = gun_crimes_per_district_df['dist_num'].astype(str)
gun_crimes_per_district_df.head()
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = districts_gun_crimes_df, #using data from Query 3
key_on='feature.properties.dist_num',
columns = ['dist_num', 'gun_crimes'],
legend_name="GUN CRIME"
).add_to(gun_crime_arrests_map)
marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)
# We create a marker for each gun crime--"green" if there was an arrest and "red" otherwise.
gun='%GUN%'
for district in districts:
gun_crimes_per_district_df2 = gun_crimes_per_district_df[gun_crimes_per_district_df.dist_num==district]
for index, row in gun_crimes_per_district_df2.iterrows():
description = row['Description']
block = row['block']
loc = (row['latitude'],row['longitude'])
# print("District No: %s <br> Description: %s <br> Block: %s" %(district,description,block))
if row['arrest']==True:
folium.Marker(location=loc,popup = folium.Popup(html=\
f"District No: {district} <br> Description: {description} <br> Block: {block}"),\
icon=folium.Icon(color='green', icon='ok-sign'),).add_to(marker_cluster)
# print(f"District {district} green marker at {loc}")
else:
folium.Marker(location=loc, popup = folium.Popup(html=
f"District No: {district} <br> Description: {description} <br> Block: {block}"),\
icon=folium.Icon(color='red',icon='remove-sign'),).add_to(marker_cluster)
gun_crime_arrests_map
Locate the farthest gun crime from the police station in every district. Create a Choropleth map where the districts are shaded in proportion to the number of gun crimes in that district. For each district, find the gun crime that was farthest from police station. Add a pop-up on the Choropleth map to display the district number and the Block where the farthest gun crime occurred. Also add circle marker (of radius 5) at the location of the farthest gun crime.
# This complex query returns for each block in each district the gun crime which occured furthest
# from the police station. We also store this distance together with the police station location.
# Once again we will be positioning the markers at the police station locations.
results= []
for district in districts:
cursor.execute("""SELECT DISTINCT on (A.block) A.district,A.block, A.where_is,ST_Distance(A.where_is,B.where_is)\
FROM crimes as A, police_stations as B
WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s
and ST_Distance(A.where_is,B.where_is)
IN
( SELECT max(dist)
FROM
(SELECT ST_Distance(A.where_is,B.where_is) as dist
FROM crimes as A, police_stations as B
WHERE A.district=%s and DESCRIPTION::text LIKE %s and B.district= %s ) as f)""",
[district,gun,district,district,gun, district])
farthest_block_gun_crime = list(cursor.fetchall()[0])
cursor.execute(f"SELECT ST_X(ST_AsText(%s)), ST_Y(ST_AsText(%s))"
,(farthest_block_gun_crime[2],farthest_block_gun_crime[2]))
farthest_block_gun_crime[2:3] = cursor.fetchall()[0]
results.append(farthest_block_gun_crime)
farthest_gun_crime_df = pd.DataFrame(results,columns=['dist_num','block','crime_lat','crime_long','dist'])
farthest_gun_crime_df['dist_num']= farthest_gun_crime_df['dist_num'].astype(str)
query6_df = pd.merge(farthest_gun_crime_df, station_locations,on='dist_num')
query6_df.head()
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
farthest_block_gun_crime_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = districts_gun_crimes_df,
key_on='feature.properties.dist_num',
columns = ['dist_num', 'gun_crimes'],
legend_name="GUN CRIME"
).add_to(farthest_block_gun_crime_map)
# This time we place a circle markers at the crime locations in addition to the "regular" markers at the
# police station locations.
for district in districts:
data = query6_df.loc[query6_df.dist_num==district]
# print(data)
# print(data['block'].values)
block, crime_lat, crime_long, dist, police_lat, police_long =\
data['block'].values[0], data['crime_lat'].values[0], data['crime_long'].values[0],\
data['dist'].values[0], data['st_latitude'].values[0], data['st_longitude'].values[0]
folium.CircleMarker((crime_lat,crime_long),radius=5,color='#ff3187',
popup=folium.Popup(html=f"District No.: {district} <br> Block:{block}")).\
add_to(farthest_block_gun_crime_map)
folium.Marker(location=(police_lat,police_long),
popup=folium.Popup(\
html=f"Police Station <br> District No.: {district} <br> Farthest Gun_Crime Block:{block}")).\
add_to(farthest_block_gun_crime_map)
farthest_block_gun_crime_map
cursor.execute("SELECT district, ward, arrest, primary_type, location_description \
FROM crimes")
new = cursor.fetchall()
new_df = pd.DataFrame(new, columns=['district', 'ward', 'arrest', 'primary_type', 'location_description'])
new_df['district'] = new_df['district'].astype(str)
new_df.info()
new_df.head()
pd.crosstab(new_df['primary_type'], new_df['arrest']).apply(lambda x: x/x.sum(), axis = 1)
This indicates that 80.66% of assaults did NOT result in an arrest whereas the remaining 19.34% of assaults DID result in an arrest.
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
FROM crimes\
WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
#districts_gun_crimes_df.head()
#--------------------------
district = []
tarea=[]
with open('Boundaries.geojson') as f:
data = json.load(f)
a = data['features'] # a is a list of district data (dictionaries)
for i in range(len(a)):
obj = a[i]['geometry'] # list of coordinates defining the ith district boundary
n = a[i]['properties'] # dictionary with district number and district label keyes for ith district
district.append(n['dist_num']) # add district number to the district list
tarea.append(area(obj)/10000) # add the the area (in hectares) to area list
af = pd.DataFrame({'dist_num': district,'district_area_inHectares':tarea})
af['dist_num'] = af['dist_num'].astype(str)
final_data= pd.merge(af, districts_gun_crimes_df, on='dist_num', how='inner')
final_data['crime_density'] = round(final_data['gun_crimes']/(final_data['district_area_inHectares']/100))
final_data
# isolate violent gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
FROM crimes\
WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_violent_gun_crimes = cursor.fetchall()
districts_violent_gun_crimes_df = pd.DataFrame(districts_violent_gun_crimes, columns=['dist_num','gun_crimes'])
districts_violent_gun_crimes_df['dist_num'] = districts_violent_gun_crimes_df['dist_num'].astype(str)
districts_violent_gun_crimes_df.head()
cursor.execute(f"SELECT district, block, count(*) \
FROM crimes \
WHERE DESCRIPTION::text LIKE '{gun}' \
GROUP BY district, block")
rows=cursor.fetchall()
violent_gun_crime_type = pd.DataFrame(rows, columns=['dist_num','block','number_of_gun_crimes'])
violent_gun_crime_type['dist_num'] = violent_gun_crime_type['dist_num'].astype(str)
violent_gun_crime_type['count_max'] = violent_gun_crime_type.groupby(['dist_num'])['number_of_gun_crimes'].transform(max)
violent_gun_crime_type = violent_gun_crime_type[violent_gun_crime_type['count_max'] == violent_gun_crime_type['number_of_gun_crimes']]
violent_gun_crime_type.sort_values(by=['count_max'], ascending=False).head()
The 4 blocks with the most gun crimes each have 3 gun crimes.
query3_df = pd.merge(violent_gun_crime_type, station_locations, on = 'dist_num')
query3_df.head()
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.# Create a folium map centered on "downtown Chicago"
districts_violent_gun_crimes_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = violent_gun_crime_type,
key_on='feature.properties.dist_num',
columns = ['dist_num', 'number_of_gun_crimes'],
legend_name="GUN CRIME PER DISTRICT"
).add_to(districts_violent_gun_crimes_map)
# districts = query3_df.dist_num.unique()
for district in districts:
df3d = query3_df[query3_df.dist_num == district]
police_station_location = tuple(df3d[['st_latitude','st_longitude']].values[0])
violent_gun_crimes_per_district_df = df3d[['block','number_of_gun_crimes']]
header = violent_gun_crimes_per_district_df.to_html(classes=
'table table-striped table-hover table-condensed table-responsive')
folium.Marker(location = police_station_location,
popup =\
folium.Popup(html=f"District No : {district} - Block(s) with the Most Gun Crimes {header}",max_width=450)).\
add_to(districts_violent_gun_crimes_map)
districts_violent_gun_crimes_map
# isolate gun crime
gun='%GUN%'
cursor.execute(f"SELECT district, count(district)\
FROM crimes\
WHERE DESCRIPTION::text LIKE '{gun}' GROUP BY district")
districts_gun_crimes = cursor.fetchall()
districts_gun_crimes_df = pd.DataFrame(districts_gun_crimes, columns=['dist_num','gun_crimes'])
districts_gun_crimes_df['dist_num'] = districts_gun_crimes_df['dist_num'].astype(str)
districts_gun_crimes_df.head()
# This time our dataframe will contain one row for each district and each block in the district.
# Each row contains information about a particular gun crime including the block in which the crime occured.
# We also save the location of the crime incidendent and whether there was an arrest.
# We will use this information when creating the markers.
# In particular, markers will be positioned at the crime location instead of police station location.
data= []
for district in districts:
cursor.execute("""SELECT DISTINCT ON(caseno) caseno, block, DESCRIPTION, count(location_description), location_description,latitude,longitude\
FROM crimes\
WHERE district = %s and DESCRIPTION::text LIKE %s \
GROUP BY caseno, block, DESCRIPTION,location_description, latitude, longitude""",[district,gun])
results = cursor.fetchall()
# insert the district number as the first element of each result list
gun_crimes_per_district = [[district]+list(result) for result in results]
data += (gun_crimes_per_district)
gun_crimes_per_district_df = pd.DataFrame(data, columns=['dist_num','caseno','block',\
'Description','location_description_COUNT', 'location_description', 'latitude', 'longitude'])
gun_crimes_per_district_df['dist_num'] = gun_crimes_per_district_df['dist_num'].astype(str)
gun_crimes_per_district_df.head()
# Create a folium map centered on "downtown Chicago". The boundaries of each distrct are stored in "Boundaries.geojson".
# Use this data to draw the district regions on the folium map.
# Then shade each region "orange red" with the intensity of the color corresponding to the number of crimes.
# Remember the number of crimes per district is stored in the number_of_crimes column of crimes_per_district.
# We add this "choropleth layer" to our folium map.
gun_crime_arrests_map = folium.Map(location =(41.8781, -87.6298),zoom_start=11)
folium.Choropleth(geo_data="Boundaries.geojson",
fill_color='YlOrRd',
fill_opacity=0.5,
line_opacity=1,
data = districts_gun_crimes_df, #using data from Query 3
key_on='feature.properties.dist_num',
columns = ['dist_num', 'gun_crimes'],
legend_name="GUN CRIME"
).add_to(gun_crime_arrests_map)
marker_cluster = MarkerCluster().add_to(gun_crime_arrests_map)
# We create a marker for each gun crime--"green" if there was an RESIDENCE and "red" STREET.
gun='%GUN%'
for district in districts:
gun_crimes_per_district_df2 = gun_crimes_per_district_df[gun_crimes_per_district_df.dist_num==district]
for index, row in gun_crimes_per_district_df2.iterrows():
description = row['Description']
block = row['block']
x = row['location_description']
loc = (row['latitude'],row['longitude'])
# print("District No: %s <br> Description: %s <br> Block: %s" %(district,description,block))
if row['location_description']=="RESIDENCE":
folium.Marker(location=loc,popup = folium.Popup(html=\
f"District No: {district} <br> Description: {description} <br> Block: {block} <br> Location: {x}"),\
icon=folium.Icon(color='green', icon='ok-sign'),).add_to(marker_cluster)
# print(f"District {district} green marker at {loc}")
elif row['location_description']=="STREET":
folium.Marker(location=loc, popup = folium.Popup(html=
f"District No: {district} <br> Description: {description} <br> Block: {block} <br> Location: {x}"),\
icon=folium.Icon(color='red',icon='remove-sign'),).add_to(marker_cluster)
else: pass
gun_crime_arrests_map